Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogReader: try zst on internal source #32751

Merged
merged 14 commits into from
Jun 14, 2024
Merged

Conversation

sshane
Copy link
Contributor

@sshane sshane commented Jun 14, 2024

Similar to Azure, which allows any extension log files. Now we discover zst logs/qlogs from internal_source (mkv)

Since our current behavior is to have LogReader cause mkv to download every log file we view to the office (useradmin, PlotJuggler, notebooks, etc.), we need to return all potential URLs for bz2 and zst. If we list we still don't always know which will be available.

@gregjhogan do we still want to silently download all logs to the office when viewed with LogReader? This was not the behavior before the refactor, so I think we should be explicit/document it somewhere if we do want this.

@github-actions github-actions bot added the tools label Jun 14, 2024
@sshane sshane marked this pull request as ready for review June 14, 2024 06:56
Comment on lines +154 to +155
def internal_source_zst(sr: SegmentRange, mode: ReadMode, file_ext: str = "zst") -> LogPaths:
return internal_source(sr, mode, file_ext)
Copy link
Contributor Author

@sshane sshane Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one or the other doesn't exist, we'll spend ~0.001s extra on one HEAD request to the first log file in the segment range, not bad

@sshane sshane changed the title LogReader: internal source lists available files LogReader: internal source supports zst Jun 14, 2024
@sshane sshane changed the title LogReader: internal source supports zst LogReader: try zst on internal source Jun 14, 2024
@sshane sshane merged commit b45caf4 into master Jun 14, 2024
18 checks passed
@sshane sshane deleted the lr-internal-source-listing branch June 14, 2024 08:12
@gregjhogan
Copy link
Member

For downloading to mkv, the important thing is that people iterating on a big list of segments doesn't unknowingly repeatedly download the data (which costs money and will slow down their work). Maybe we can come up with a different way to manage this.

If we list we still don't always know which will be available.

We probably need to make some changes if this is now a concern. The idea that LogReader can't simply operate on files that can be listed sounds like a problem to me.

@gregjhogan
Copy link
Member

I am in favor of only using the internal source when it has the files LogReader needs if that simplifies things, then find a way to exclusively use the internal source when egress bandwidth cost is a concern.

Edison-CBS pushed a commit to Edison-CBS/openpilot that referenced this pull request Sep 15, 2024
* internal source list files like azure api

* messy but works

* no limit

* simpler

* clean up

* clean up

* clean up

* that's obvious

* better

* we need to unfortunately return a url, so best to take a naive approach for now

* todo

* fix

* clean up
old-commit-hash: b45caf4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants